Real-Time Road Intersection Detection in Sparse Point Cloud Based on Augmented Viewpoints Beam Model

Road intersection is a kind of important navigation landmark, while existing detection methods exhibit clear limitations in terms of their robustness and efficiency. A real-time algorithm for road intersection detection and location in large-scale sparse point clouds is proposed in this paper. Different from traditional approaches, our method establishes the augmented viewpoints beam model to perceive the road bifurcation structure. Explicitly, the spatial features from point clouds are jointly extracted in various viewpoints in front of the robot. In addition, the evaluation metrics are designed to self-assess the quality of detection results, enabling our method to optimize the detection process in real time. Considering the scarcity of datasets for intersection detection, we also collect and annotate a VLP-16 point cloud dataset specifically for intersections, called NCP-Intersection. Quantitative and qualitative experiments demonstrate that the proposed method performs favorably against the other parallel methods. Specifically, our method performs an average precision exceeding 90% and an average processing time of approximately 88 ms/frame.


Introduction
An unmanned ground vehicle (UGV) should find out when and where to turn while navigating in a city.As a result, the intersection becomes a kind of crucial landmark.Intersection detection under normal traffic conditions is a challenging perception task due to the variability in scenarios [1].The current intersection detection methods can be categorized into two main classes: learning-based [2][3][4][5][6] and learning-free [7][8][9][10][11] approaches.The former methods typically combine various types of data sources, including images, point clouds, GPS, and trajectories.However, they tend to struggle in real-life situations when only a single data source is available.On the other hand, the latter approaches primarily rely on the traditional algorithms for enhancing intersection detection based solely on point clouds.The beam model algorithm has been widely employed in road intersection detection due to its remarkable ability to extract spatial features.However, existing algorithms based on the beam model exhibit limitations in terms of robustness when confronted with intersections of varying structures.These limitations stem from the lack of postprocessing methods after obtaining beam model data and the algorithm's inability to perform online learning.To overcome these challenges, this paper proposes a novel lidar-based method to detect and locate intersections.It excels in maintaining real time and robustness when confronted with dynamic road intersection scenarios, as presented in Figure 1.
Learning-based approaches detect the intersections by recognizing scene features, which can quickly identify the characteristics of road boundaries [12][13][14][15][16][17][18][19].Hata [12] identified road intersections by matching the road boundary data with the pre-defined inter-section model.This method is only applicable to structured roads with complete road boundaries.Furthermore, there are also video-based methods [15].However, the actual environment is accompanied by interference, like overexposure or direct sunlight.What is more, there exist road intersection detection methods in the field of remote sensing images [16].These methods detect road intersections through remote sensing images and are not suitable for general mobile robots.That makes these methods difficult to adapt to the environment.Generally, these methods face the lack of local features and bear high computational overhead.In addition, there also exist intersection detection algorithms without machine learning methods.Thrun [20] firstly proposed the beam model and successors developed related methods [7,8,10,11].Wang [11] designed a wave crest search algorithm to extract the angle characteristics of intersection bifurcation from the output of the model.Although the processing time is fast, its detection results completely depend on the current unique detection results, and the reliability is poor.Zhang [10] proposed a sliding beam model to eliminate the dependence on detection distance parameters, while the relationship between adjacent test results is not considered in this method.Wang [9] proposed a double-layer beam model to detect the boundary and shape of the road.However, the robustness of these methods is still poor, and the excellent performance is dependent on structured road conditions.Due to the inability to fully account for the influence of obstacles on detection results, these methods lack robustness in diverse road conditions.
In certain circumstances, such as caves or buildings, data sources often possess only point clouds due to environmental and equipment limitations.Under such conditions, a UGV usually cannot observe the complete intersection shape [21,22].Even when positioned at the center of an intersection, it may be obstructed by various obstacles within the channel.Another problem lies in how the UGV can determine the quality of its intersection detection results [23,24].The above problems bring great challenges to the intersection detection task.Aiming to address the aforementioned issues, this paper introduces a real-time intersection detection method based on lidar.It is capable of robustly detecting and locating intersections, even in sparse point cloud data.
To summarize, the main contributions of this paper are threefold: (1) We propose an augmented viewpoints beam model and design a real-time intersection detection method based on the model.Experiments on VLP-16 and HDL-64 lidar data show that the algorithm works well in real traffic conditions.(2) We design online evaluation metrics to evaluate the quality of the intersection detection results.It enables the UGV to self-assess the quality of the intersection detection in real time during moving.

Method
The flowchart of our intersection detection method is presented in Figure 2. The proposed method takes a raw sparse point cloud as input and outputs the position of intersection as well as the bifurcation angle.It can be divided into three steps: step 1 involves preprocessing point clouds by removing the ground and obstacles; step 2 focuses on intersection detection and location; and step 3 entails an online evaluation of the calculation results.This self-assessment capability allows our method to evaluate the quality of detection results.

Preprocessing
The purpose of preprocessing is to remove the ground and obstacles to reduce the false detection rate.For the initial data processing, we start by isolating the relevant region of interest (ROI).To remove the influence of the ground, the RANSAC method [25] is employed.However, directly fitting the entire ground plane using RANSAC proves challenging due to the irregularities and unevenness of the road surface.As a result, we utilize RANSAC to fit specific areas of the road surface, resulting in a set of fitted screens.Utilizing this set, we calculate the angle between each plane's normal vector in the set and the Z-axis of the lidar.If the angle falls below a certain threshold, it indicates that the plane is part of the ground surface.After the ground detection process, we can separate the ground point clouds from the initial point clouds and remove the interference caused by the ground.
Subsequently, we proceed to distinguish between the ground and non-ground regions on the two-dimensional projection map along the Z-axis of the lidar.Within the ground area, the above-ground point clouds often manifest as a cluster of isolated points.To address this, we employ the Breadth-First Search (BFS) algorithm to cluster these isolated points together.As a result, individual isolated points are grouped into distinct obstacles such as pedestrians, bicycles, or cars, which need to be removed.Finally, the point clouds set P without the ground and obstacles can be achieved.The results of the preprocessing are shown in Figure 3.We mark the obstacles with red boxes in Figure 3b.

Augmented Viewpoints Beam Model-Based Intersection Bifurcation Detection
Algorithm 1 shows the process of intersection bifurcation detection based on the augmented viewpoints beam model.Based on different viewpoints, we calculate n groups of bifurcation angles.We then screen and select the optimal bifurcation angles as the detecting result, guided by several thresholds.In this algorithm, S i (i = 1, 2, . . ., n) saves the bifurcation angles detected by all the beam models, and S f saves the final bifurcation angles of the intersection.add bifurcation angle end if 23: end for

Grid Map-Based Single-Viewpoint Beam Model
In the first step of our method, the point clouds P after removing the ground and obstacles need to be rasterized.Based on the grid width d w = 0.2 m and region of interest (ROI), the initial grid map M can be created.Then, the grids are filtered by checking if there are point clouds present or not, effectively dividing them into occupied and unoccupied grids.This step projects the three-dimensional point clouds onto a two-dimensional plane, preserving all the intersection features while minimizing the computational loss.
In the beam model, the viewpoint is firstly determined as the center of the model.And the beam is defined as the connection between the viewpoint and occupied grids.As shown in Figure 4a, γ section represents the division angle section corresponding to some grid in the occupied grid map.We set γ a = 1 • as the beam ray resolution, which represents the angular spacing of the adjacent beam rays.In grid map M, each beam ray γ grid of the grid m s,t is calculated as follows: where (origin r , origin c ) is the position of the beam model's origin in M. For all the occupied grids m s,t with the same beam ray, the division angle section γ section is defined as: where k ∈ {1, 2, 3, . . ., 360}.At the same time, each beam ray in γ section corresponds to a beam length l k , and the definition is: As is shown in Figure 4a, section γ mul contains multiple occupied grids, and we use the Euclidean distance between the shortest grid and the central grid as the beam length of γ mul .If the section contains only one grid like γ sin , the beam distance is determined by the occupied grid.After normalization, the beam length l k is quantified as a value between 0 and 1.For the division angle section without an occupied grid, l k is recorded as 1.It indicates that there is no occlusion from this angle, and there may exist bifurcation in this direction.The relationship between the beam length and beam ray is shown in Figure 4c.We set the threshold L c to screen out the qualified beam length fragment as the bifurcation angle.

Augmented Viewpoints Beam Model
With a single-viewpoint beam model, we place n virtual viewpoints (p 1 , p 2 , . . ., p n ) in front of the UGV as is shown in Figure 5a.They are applied as the origins to establish multiple beam models in one frame of point clouds.We set two thresholds T W and T L in advance to filter bifurcations.T W represents the minimum angle range covered by a single angle set, and T L indicates the maximum angle difference between two adjacent angle sets.The interval of the viewpoints is d m , and the position of the i-th viewpoint is defined as:

Determining Intersection Center
Based on the bifurcation angle sets, we can figure out the center of the intersection location.The key problem is to determine the distance between the intersection and the UGV.In the robot coordinate system, the position of the center of the intersection refers to the position of the i-th model in front of the UGV.Based on the intersection bifurcation angle set, the bifurcation angle of the same intersection is very close in theory.Therefore, even if there are missed or false detections, the other correct angles still have high similarity.
After the above operations, we can obtain the bifurcation angle set S i in each single beam model and the final intersection bifurcation angle set S f .For every element s ij in S i , we find its closest element s f i in S f .Then, the degree of separation dd i between two bifurcation angle sets is defined as: where |S i | indicates the number of angles in S i .It is applied to reduce the effects of possible false detection angles.Selecting the origin of the beam model with the smallest dd i , the relative position of the intersection center can be achieved.

Confidence Evaluation
In this section, we design a metric to measure the quality of the detection results online.Fully considering the interfering factors of our method, the three aspects below are considered.

The Number of Bifurcations
During the process of intersection detection, false detection will lead to the local jump in the number of intersection bifurcations.Aiming at this problem, we set the confidence c 1 : where m indicates the number of beam models for calculating the correct bifurcation number, and n indicates the total number of beam models.

The Angle of Bifurcations
Based on Equation ( 5), we achieve the smallest degree of separation dd min .It reflects the influence of false detection on the final intersection bifurcation angle set.The higher the value, the less credible the result is.Therefore, we set the confidence c 2 :

Intersection Location Matching
According to the detection properties of our method, the detection results of the robot and the detection distance show a Gaussian distribution.The separation degree and detection distance of the bifurcation angle set can be approximately fitted as an opening-upward quadratic curve.The extreme point will be the best matching point of the intersection position.After calculating the detection distance of the best matching point i min , we compare it with the relative position of the intersection i r .As the two values approach, the detection results will be reliable.The confidence c 3 is designed as Considering the above three aspects as a whole, we weigh and add the above three confidence results.The final evaluation function C(R) is expressed as: where λ 1 , λ 2 , and λ 3 are weighting coefficients.By evaluating the detection results, the optimal intersection detection results can be obtained when the UGV passes through the intersection.If there are obstacles such as pedestrians or bicycles ahead, the detection results with high confidence can be maintained.This will contribute to providing more reliable intersection information for subsequent navigation.

Experimental Data and Environment
The proposed method is verified on two actual scene datasets, including KITTI-raw and NCP-Intersection.In addition, the simulation experiments in Gazebo and Carla are conducted to simulate both indoor and outdoor scenes.A total of 1000 frames are evaluated in this study.The selected data in KITTI-raw contain straight, T-shaped, Y-shaped, and +-shaped intersections, with a total of 200 frames.NCP-Intersection is a dataset we collected and annotated in a campus environment with our robot, which is shown in Figure 6.Due to the impacts of the vertical scanning angle of the lidar and the road width on detection performance, we have placed restrictions on the installation height of the lidar.Because our collected data primarily pertain to campus roads with a width of approximately 6.5 m, we have determined the lidar height from the ground to be 0.72 m.
NCP-Intersection includes straight, T-shaped, +-shaped, and L-shaped intersections, with a total of 500 frames.The point cloud is collected by Velodyne VLP-16 Lidar in NCP-Intersection and is much sparser than the point cloud in KITTI-raw.A DGPS is used to obtain the ground truth location of the intersection.The robots in both Gazebo and Carla are also equipped with Velodyne VLP-16 Lidar.For each simulation environment, we select 150 frames of intersections for further analysis and evaluation.The proposed algorithms in the actual scenes are tested both on an Intel i5-8259U and 8-core ARM CPU, respectively.

Ablation Study of Parameters
The proposed method considers the parameters T W , T L , and T s as crucial factors that influence the detection outcomes.The different combinations of these parameters have a direct impact on the method's performance.To determine the optimal combination, the True Segmentation Rate (TSR) [10] is used as the evaluation metric.Figure 7 and Table 1 present the objective evaluation values obtained from the various combinations tested on the NCP-Intersection dataset.In Figure 7, we conduct a grid search and exhaustively list all the possible values for the three parameters.As shown in Figure 7a, the detection performance is the best in the central area.Therefore, we narrow down the parameter value range, as shown in Figure 7b.Referring to Table 1, it is evident that our method achieves the best performance when T W = 20, T L = 30, and T s = tan49.Consequently, based on this analysis, we adopt T W = 20, T L = 30, and T s = tan49 in the framework.

Results and Analysis
Firstly, we verify the necessity of preprocessing. Figure 8 shows the importance of obstacle removal in accurately detecting the number of bifurcations.Without removing the obstacles, the beam model incorrectly detected the +-shaped intersection as being T-shaped (Figure 8a).The detection result is correct after removing the obstacles and accurately reflects the actual intersection shape (Figure 8b).
Figure 9 shows the statistical results of the bifurcation set detected by the augmented viewpoints beam model.Based on the angle set of the bifurcation in Figure 9c, we find that segments z and | with the slope close to the horizontal correspond to the front and rear bifurcations.The left and right bifurcations correspond to segments y and {.However, the algorithm also detects x by mistake as the bifurcation.With filtering through T s and T g , it considers x as an obstacle interval and eliminates the possibility of it being a bifurcation.To demonstrate the superiority of our proposed method, we conduct a comparative analysis with five parallel approaches that employ enhanced algorithms for the beam model.These approaches include methods proposed by Zhu [8], Chen [7], Zhang [9], Zhang [10], and Wang [11].The approaches proposed by Zhu [8] and Chen [7] employ a single beam model, where a single viewpoint is established to construct the beam model and extract intersection features.Zhang [9] introduces a double-layer beam model to recognize the intersection shape and classify the road type.The method proposed by Zhang [10] utilizes a sliding beam method for road segmentation.Lastly, the approach proposed by Wang [11] extracts edge feature points from the point cloud data using height, flatness, and horizontal error as criteria and then utilizes the beam model to perform intersection detection.
In contrast to the previous method, our approach establishes multiple beam models to minimize false detections.Furthermore, we have designed evaluation metrics to self-assess the quality of the detection results.This allows our method to optimize the detection process in real time.The evaluation is performed on three datasets: KITTI, NCP-Intersection, and the simulation, as illustrated in Figures 10 and 11.To showcase the effectiveness of our method, we select eight frames of representative point cloud images from these datasets.In the figures, the purple points denote the intersection center, while the colored lines indicate different bifurcation directions.In Figure 10, we have selected four frames from the intersection scenarios, namely, straight, Y-shaped, T-shaped, and +-shaped ones.Zhu [8] adopts a fixed distance for detecting intersections using the beam model, which restricts its ability to detect intersections at different locations.Chen [7] utilizes a range finder-based beam model and employs a distance function in relation to the angle of each beam to identify intersections.While this method shows some improvements, its effectiveness is limited in sparse point clouds, which is particularly evident in Column (c) of Figure 10.The other three comparison methods also exhibit shortcomings when confronted with obstacles.In contrast, our proposed method comprehensively accounts for real-world road conditions and maintains robustness across various structures, regardless of the sparsity of the data source.
Figure 11 displays the detection results in both outdoor flat roads and customized interior structures within the simulation environments.It is observed that as the smoothness of the road surface improves, the detection effects become more robust compared to Figure 10.In the absence of obstacles on an open road, the methods of Zhu [8] and Chen [7] perform relatively well in comparison to the other three methods.However, when encountering unconventional road structures, all five comparison methods show instances of false detection.In contrast, our proposed method effectively addresses the impact of environmental transitions on the detection results.Even when there are obstructions present at the intersection, our method can accurately detect and locate the intersections.
When comparing the overall effects of Figures 10 and 11 side by side, we can observe that the algorithms proposed in [8] and [7] perform well in a fixed distance in front of the UGV, but its effectiveness is limited in other scenarios.The algorithms in [10] and [9] tend to prioritize detection results with a higher number of bifurcations, which may introduce biases.The method described in [11] shows good performance in most road conditions, but it still struggles to completely eliminate the influence of obstacles.In contrast, our proposed method demonstrates excellent robustness and accuracy across all the listed road structures.It takes into account the outputs of multiple beam models and achieves precise results across various types of terrain.
In addition, we also set four indicators to evaluate the intersection detection effects, including the running time, averaging detecting distance (Dist), intersection segment rate (ISR), and location failure rate (LFR).The ISR focuses on determining whether the detected intersection bifurcation angle is correct, and the LFR records the probability of location estimation failure.Their definitions are shown in Equation (10).
M is the set of all the sparse point clouds and N a denotes the correct dataset of the detected bifurcation angle results.N t represents the dataset with the correct intersection position.Moreover, commonly used evaluation metrics such as the F 1 score, Precision (Positive Predictive Value, PPV), and Recall (True Positive Rate, TPR) are also applied.PPV denotes the proportion of correctly detected intersections out of all the detected samples.TPR represents the proportion of correctly detected results out of the labeled intersections.The F 1 score is defined as the harmonic average of the PPV and TPR.The definitions are shown as follows:    PPV = TP/(TP + FP) TPR = TP/(TP + FN) where TP is the true positive number and FP is the false positive number (false detection).FN means false negative (missed detection).The detection and location results are shown in Table 2. From the results, we found that while the single beam model in [11] exhibits good real-time performance similar to our method, its detection accuracy is significantly lower compared to our method.The algorithm in [10] is more suitable for the platform with high computing performance and is sensitive to the interference of obstacles.When our method detects an intersection, the UGV is positioned at the farthest distance from the intersection.Despite the point cloud density of KITTI being approximately four times that of NCP-Intersection, the computational time of the proposed algorithm is only 1.3 times when using a grid map.In comparison, our method achieves a faster processing time and performs better across various common road shapes.

Conclusions
This paper presents a novel and efficient lidar-based algorithm for detecting and locating intersections.The proposed algorithm incorporates the augmented viewpoints beam model, which enhances the robustness of intersection detection in real traffic scenarios involving pedestrians and vehicles.Additionally, an evaluating confidence metric is introduced to enable online self-assessment of the detection performance.Due to the limited availability of datasets for intersection detection, we have taken the initiative to collect and annotate a VLP-16 point cloud dataset that is specifically tailored for intersections.The experimental evaluations conducted on three datasets demonstrate the real-time capability of the proposed method on both X86 and ARM CPUs.
As of now, there is a lack of proposed deep learning-based intersection detection methods that solely rely on point cloud data as input.This is primarily due to the variability in point cloud data across different road structures.Nonetheless, deep learning exhibits notable advantages in terms of feature representation ability when compared to traditional methods.Therefore, there is immense potential for enhancing intersection detection methods by leveraging deep learning techniques.Moving forward, our future work aims to further advance intersection perceptual tasks by integrating additional intersection semantic features through deep learning.

Figure 1 .
Figure 1.Intersection detection and location by Velodyne VLP-16.The red arrow indicates the pose of the UGV.

Figure 2 .
Figure 2. Overall structure of proposed method based on sparse point cloud.Our method consists of three main parts: (1) data preprocessing; (2) intersection detection; and (3) evaluation.In the data preprocessing phase, raw point clouds are processed to eliminate ground and obstacles.The intersection detection module utilizes an augmented viewpoints beam model to accurately perceive the road bifurcation structure.It enhances the detection effects by establishing multiple virtual viewpoints in front of UGV as the center of beam model.Finally, factors such as the number, angle, and location of detected intersections are taken into consideration to determine the level of confidence.This self-assessment capability allows our method to evaluate the quality of detection results.

Figure 3 .
Figure 3. Steps of preprocessing.(a) is the raw sparse point cloud data.(b) is the point clouds after removing ground.(c) is the bird's eye grid map M after removing obstacles.

Algorithm 1 1 : 4 : 20 :
figure out bifurcation angle set S i on current beam model

Figure 4 .
Figure 4. Schematic diagram of traditional beam model.(a) is the beam sections projected on the grid map.(b) is the detection effects of single beam model.(c) is the relationship between beam length and corresponding beam ray.There exist several groups of continuous angle regions with beam length of 1.Each group should be a reasonable candidate angle set for bifurcation.

4 )Figure 5 .
Figure 5.The red arrow in (a) represents UGV, and the blue circle indicates the virtual viewpoints in front of UGV.The relationship between bifurcation number and viewpoint id are shown as (b).The sketch map of the augmented viewpoints beam model in single frame of point cloud is shown as (c-e), and green lines indicate the calculated bifurcations.

Figure 6 .
Figure 6.The experimental platform.The vehicle is equipped with a Velodyne VLP-16 Lidar, an inertial measuring unit, and a global navigation satellite system.The lidar is mounted on the top of the vehicle with a height of 72 cm.

Figure 7 .
Figure 7. Results of grid search for three parameters T W , T L , and T s .As depicted in (a), the optimal value of TSR is achieved when T W ∈ (15, 25), T L ∈ (25, 35), and T s ∈ (45, 55).To determine the optimal combination, we further restrict the range of these three parameters and conduct a grid search again, as illustrated in (b).

Figure 11 .
Figure 11.The performance of six algorithms in simulation.Columns (a) and (b) represent the results obtained in Carla environment, while Columns (c) and (d) represent the results obtained in Gazebo

Table 2 .
Average of metrics on three datasets (bold: best).